Stationary policies and Markov policies in Borel dynamic programming

نویسندگان

چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Quantized Stationary Control Policies in Markov Decision Processes

For a large class of Markov Decision Processes, stationary (possibly randomized) policies are globally optimal. However, in Borel state and action spaces, the computation and implementation of even such stationary policies are known to be prohibitive. In addition, networked control applications require remote controllers to transmit action commands to an actuator with low information rate. Thes...

متن کامل

Splitting Randomized Stationary Policies in Total-Reward Markov Decision Processes

This paper studies a discrete-time total-reward Markov decision process (MDP) with a given initial state distribution. A (randomized) stationary policy can be split on a given set of states if the occupancy measure of this policy can be expressed as a convex combination of the occupancy measures of stationary policies, each selecting deterministic actions on the given set and coinciding with th...

متن کامل

Regular Policies in Abstract Dynamic Programming

We consider challenging dynamic programming models where the associated Bellman equation, and the value and policy iteration algorithms commonly exhibit complex and even pathological behavior. Our analysis is based on the new notion of regular policies. These are policies that are well-behaved with respect to value and policy iteration, and are patterned after proper policies, which are central...

متن کامل

Efficient Policies for Stationary Possibilistic Markov Decision Processes

Possibilistic Markov Decision Processes offer a compact and tractable way to represent and solve problems of sequential decision under qualitative uncertainty. Even though appealing for its ability to handle qualitative problems, this model suffers from the drowning effect that is inherent to possibilistic decision theory. The present paper proposes to escape the drowning effect by extending to...

متن کامل

Planning structural inspection and maintenance policies via dynamic programming and Markov processes. Part I: Theory

To address effectively the urgent societal need for safe structures and infrastructure systems under limited resources, science-based management of assets is needed. The overall objective of this two part study is to highlight the advanced attributes, capabilities and use of stochastic control techniques, and especially Partially Observable Markov Decision Processes (POMDPs) that can address th...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Probability Theory and Related Fields

سال: 1987

ISSN: 0178-8051,1432-2064

DOI: 10.1007/bf01845641